Write standalone --debug output to CSV file#15182
Write standalone --debug output to CSV file#15182OliverRietmann wants to merge 5 commits intoAliceO2Group:devfrom
Conversation
|
REQUEST FOR PRODUCTION RELEASES: This will add The following labels are available |
davidrohr
left a comment
There was a problem hiding this comment.
I like this and it makes a lot of sense. Still, I have some comments how it could be improved, please have a look.
| return 0; | ||
| } | ||
|
|
||
| namespace |
There was a problem hiding this comment.
I would propose we try to move additional functions to GPUReconstructionDebug.cxx, in order not to blow up GPUReconstructionGPU.cxx with timing and debug code.
| double kernelTotal = 0; | ||
| std::vector<double> kernelStepTimes(gpudatatypes::N_RECO_STEPS, 0.); | ||
|
|
||
| std::ofstream benchmarkCSV; |
There was a problem hiding this comment.
Also here, let's try to encapsulate timing code to GPUReconstructionDebug.cxx
| AddOption(deterministicGPUReconstruction, int32_t, -1, "", 0, "Make CPU and GPU debug output comparable (sort / skip concurrent parts), -1 = automatic if debugLevel >= 6 or deterministic compile flag set", def(1)) | ||
| AddOption(showOutputStat, bool, false, "", 0, "Print some track output statistics") | ||
| AddOption(runCompressionStatistics, bool, false, "compressionStat", 0, "Run statistics and verification for cluster compression") | ||
| AddOption(resetTimers, int8_t, 1, "", 0, "Reset timers every event") |
There was a problem hiding this comment.
If you change this here to 0, the timers will not be reset anymore, thus when debug mode is enabled in O2, timers will not be reset between time frames.
I think instead, I would leave the default at 1 here, and in standalone.cxx I would change the overriding to if (...runsInit > 0 && ...timingCSV == 0) rec->SetResetTimers(iRun < configStandalone.runsInit)
| kernelStepTimes[stepNum] += time; | ||
| } | ||
| char bandwidth[256] = ""; | ||
| Row task_row; |
There was a problem hiding this comment.
I would prefer a more compact form, couldn't you initialize all the task_row members in one line with an initializer list?
| if (kernelStepTimes[i] != 0. || mTimersRecoSteps[i].timerTotal.GetElapsedTime() != 0.) { | ||
| printf("Execution Time: Step : %11s %38s Time: %'10.0f us %64s ( Total Time : %'14.0f us, CPU Time : %'14.0f us, %'7.2fx )\n", "Tasks", | ||
| gpudatatypes::RECO_STEP_NAMES[i], kernelStepTimes[i] * 1000000 / mStatNEvents, "", mTimersRecoSteps[i].timerTotal.GetElapsedTime() * 1000000 / mStatNEvents, mTimersRecoSteps[i].timerCPU * 1000000 / mStatNEvents, mTimersRecoSteps[i].timerCPU / mTimersRecoSteps[i].timerTotal.GetElapsedTime()); | ||
| Row reco_step_row; |
There was a problem hiding this comment.
And actually, this is quite redundant. Also the benchmarkCSV.is_open()
How about instead we have a class debugTimingWriter, and then you just have one-liners here
timingWriter.write(kernelStepTimes[i], mTimersRecoSteps[i].timerCPU, mTimersRecoSteps[i].timerTotal.GetElapsedTime(), ...)
And inside you could do
if (writeCSV) {
...}
if (writeMarkdown) {
...
} else {
... classic style
}
Then whatever old parsing scripts still exist, are still compatible. and it will reduce the lines of code in GPUReconstructionCPU.cxx.
New Table Style Output